A Study of Academic Collaboration in Computational Linguistics with Latent Mixtures of Authors

Authors

  • Nikhil Johri
  • Daniel Ramage
  • Daniel A. McFarland
  • Daniel Jurafsky
Abstract

Academic collaboration has often been at the forefront of scientific progress, whether among prominent established researchers or between students and advisors. We propose a theory of the different types of academic collaboration and use topic models to identify them computationally in the Computational Linguistics literature. A set of author-specific topics is learnt over the ACL corpus, which spans 1965 to 2009. The models are trained on a per-year basis: only papers published up to a given year are used to learn that year's author topics. To determine the collaborative properties of a paper, we use as a metric a function of the cosine similarity between the paper's term vector and each author's topic signature in the year preceding the paper's publication. We apply this metric to examine questions about the nature of collaboration in Computational Linguistics research, and find that the way people collaborate varies significantly across subfields.
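
The core comparison described in the abstract, cosine similarity between a paper's term vector and each author's topic signature, can be illustrated with a minimal sketch. The abstract does not specify the exact function applied to the cosine score, so the code below computes only the raw cosine similarity. The function names (`cosine_similarity`, `author_similarities`), the dictionary representation of a topic signature, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as term -> weight dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        # An empty vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def author_similarities(paper_tokens, author_signatures):
    """Score a paper against each author's topic signature.

    paper_tokens: list of tokens from the paper (hypothetical preprocessing).
    author_signatures: author -> {term: weight}, assumed to come from a model
        trained only on papers published before the paper's year.
    """
    paper_vector = Counter(paper_tokens)  # raw term counts as the term vector
    return {
        author: cosine_similarity(paper_vector, signature)
        for author, signature in author_signatures.items()
    }


# Toy data, invented purely for illustration.
signatures = {
    "author_a": {"parsing": 0.5, "grammar": 0.5},
    "author_b": {"translation": 1.0},
}
scores = author_similarities(["parsing", "parsing", "grammar"], signatures)
for author, score in sorted(scores.items()):
    print(author, round(score, 3))
```

In this sketch, an author whose signature shares no terms with the paper scores zero, while an author whose signature overlaps heavily with the paper's vocabulary scores close to one. How such per-author scores are combined into a measure of collaboration type is left to the paper's own definition.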

Similar resources

A Study of Academic Collaborations in Computational Linguistics using a Latent Mixture of Authors Model

Academic collaboration has often been at the forefront of scientific progress, whether amongst prominent established researchers, or between students and advisors. We suggest a theory of the different types of academic collaboration, and use topic models to computationally identify these in Computational Linguistics literature. A set of author-specific topics are learnt over the ACL corpus, whi...

A Labeled LDA Approach to Understanding the Dynamics of Collaboration

We propose a topic modeling approach to understand the nature of academic collaborations between individuals. Specifically, we use Labeled LDA (Ramage et al., 2009), a variation of the popular topic model Latent Dirichlet Allocation (Blei et al., 2003), to train a set of author-specific topics over the ACL corpus. The ACL corpus ranges from 1965 to 2009, and we train a separate topic model for ...

A Model of Authors’ Generic Competence of EAP Research Articles: A Qualitative Meta-Synthesis Approach

Genre analysis, an area of great concern in recent decades, involves the observation of linguistic features used by a given discourse community. The research article (RA) is one of the most widely researched genres in academic writing, and is realized through rhetorical moves and discursive steps that achieve a communicative purpose. This study aimed at proposing a model of generic p...

A Comparative Study of Metadiscourse in Academic Writing: Male vs. Female Authors of Research Articles in Applied Linguistics

Like conversation and other modes of communication, writing is a rich medium for gender performance. In fact, writing functions to construct the disciplines as well as the gender of its practitioners. Despite the significance of author gender, as one constitutive dimension of any writing, it has been relatively under-researched. One way, by means of which author gender is practiced, and reveale...

A Comparative Study of Lexical Bundles in Soft Science Articles Written by Native and Iranian Authors

Writing academic texts by novice researchers requires a framework and support by learning how to cite the works of others. However, compared to the studies on other academic writings, studying citations by considering certainty markers has received little attention. The main purpose of this study was to investigate the shifts of certainty markers (hedges and boosters) in pre- and post-citation ...

Publication date: 2011